Evaluation of French and English MeSH Indexing Systems with a Parallel Corpus
Abstract
OBJECTIVE This paper presents the evaluation of two MeSH indexing systems for French and English on a parallel corpus. MATERIAL AND METHODS We describe two automatic MeSH indexing systems: MTI for English and MAIF for French. The French version of the evaluation resources has been manually indexed with MeSH keyword/qualifier pairs. This professional indexing is used as our gold standard in the evaluation of both systems on keyword retrieval. RESULTS The English system (MTI) obtains significantly better precision and recall (78% precision and 21% recall at rank 1, vs. 37% precision and 6% recall for MAIF). Moreover, the performance of both systems can be optimised by the breakage function used by the French system (MAIF), which selects an adaptive number of descriptors for each resource indexed. CONCLUSION MTI achieves better performance. However, both systems have features that can benefit each other.
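The precision and recall "at rank 1" figures quoted in the abstract can be illustrated with a minimal sketch. It assumes each system returns a ranked list of MeSH descriptors for a resource and that the gold standard is the set of descriptors assigned by the professional indexers. The function name, the sample descriptors, and the per-resource formulation are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_rank(
    ranked: Sequence[str], gold: Set[str], k: int
) -> Tuple[float, float]:
    """Precision and recall over the top-k descriptors of one resource.

    ranked: descriptors proposed by the system, best first (hypothetical input).
    gold:   descriptors assigned manually (the gold standard).
    """
    if k <= 0:
        raise ValueError("k must be a positive rank")
    top_k = list(ranked[:k])
    # A hit is a proposed descriptor that also appears in the gold standard.
    hits = sum(1 for descriptor in top_k if descriptor in gold)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Made-up example data, for illustration only.
ranked = ["Asthma", "Child", "Lung", "Humans"]
gold = {"Asthma", "Humans", "Anti-Asthmatic Agents", "Adolescent"}

print(precision_recall_at_rank(ranked, gold, 1))  # (1.0, 0.25)
print(precision_recall_at_rank(ranked, gold, 4))  # (0.5, 0.5)
```

With only one descriptor retained, recall is capped at one over the size of the gold set. This is consistent with the pattern reported in the abstract, where precision at rank 1 is much higher than recall at rank 1.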
Similar resources
Evaluation of a Simple Method for the Automatic Assignment of MeSH Descriptors to Health Resources in a French Online Catalogue
BACKGROUND The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. OBJECTIVE To develop a simple automatic tool that retrieves MeSH descriptors from document titles. METHODS In parallel to research on advanc...
Using Word Alignment to Extend Multilingual Medical Terminologies
Medical terminologies such as those provided in the UMLS are never exhaustive and there is a constant need to enrich them, especially in terms of multilinguality. We present a methodology to acquire new French translations of English medical terms based on word alignment in a parallel corpus — i.e. pairing of corresponding words. We automatically collected a 27.7-million-word parallel, English-...
Evaluation of Cross-Language Information Retrieval Using the Domain-Specific GIRT Data as Parallel German-English Corpus
The development of the evaluation of domain-specific cross-language information retrieval (CLIR) is shown in the context of the Cross-Language Evaluation Forum (CLEF) campaigns from 2000 to 2003. The pre-conditions and the usable data and additionally available instruments are described. The main goals of this task of CLEF are to allow the evaluation of Cross-Language Information Retrieval (CLI...
UniNE at CLEF
As participants in this CLEF evaluation campaign, our first objective is to propose and evaluate various indexing and search strategies for the CHiC corpus, in order to compare the retrieval effectiveness across different IR models. Our second objective is to measure the relative merit of various stemming strategies when used for the French and English monolingual task in the CH context. Our th...
Grepator: Accents & Case Mix for Thesaurus
There is a real need among researchers and students for pedagogical resources. In France, information retrieval techniques have been developed, for example in the Doc'CISMeF web site. As in PubMed, documents are indexed with (French) MeSH terms. One of the problems discovered in quality studies is the mismatch between user requests and the MeSH controlled vocabulary. Moreover, French (bu...
Journal title:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Publication date: 2005